BIC selection procedures in mixed effects models

Authors

  • Maud Delattre
  • Marc Lavielle
  • Marie-Anne Poursat
Abstract

We consider the problem of variable selection in general nonlinear mixed-effects models, including mixed-effects hidden Markov models. These models are used extensively in the study of repeated measurements and in longitudinal analysis. We propose a Bayesian Information Criterion (BIC) that is appropriate for nonstandard situations where both the number of subjects N and the number of measurements per subject n tend to infinity. In this case, the consistency rates of the maximum likelihood estimators (MLE) of the parameters depend on the levels of variability specified in the model. We show that the MLE of the population parameters related to subject-specific parameters are √N-consistent, whereas the MLE of the parameters related to fixed parameters (those without a random component) are √(Nn)-consistent. We derive a BIC with a penalty based on two terms proportional to log N and log(Nn). Finite-sample properties of the proposed selection procedure are investigated by simulation studies.

Key-words: Consistency rate, Nonlinear mixed model, Hidden Markov mixed-effects model, Variable selection.

∗ Laboratoire de Mathématiques, Université Paris-Sud, France & Popix, Inria Saclay Ile-de-France

hal-00696435, version 1, 11 May 2012
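The two-term penalty described in the abstract can be illustrated with a small sketch. The abstract only states that the penalty terms are proportional to log N and log(Nn); taking the proportionality constants to be the number of parameters in each group is an assumption made here for illustration, and the function name and argument names are hypothetical rather than taken from the paper.

```python
import math


def two_rate_bic(log_lik, n_subjects, n_obs_per_subject,
                 n_random_params, n_fixed_params):
    """Illustrative two-term BIC (hypothetical helper, not the paper's code).

    Assumption: each penalty term is weighted by a parameter count.
      - n_random_params: population parameters tied to subject-specific
        (random) parameters, penalized with log(N)
      - n_fixed_params: parameters without a random component,
        penalized with log(N * n)
    """
    if n_subjects < 1 or n_obs_per_subject < 1:
        raise ValueError("N and n must both be at least 1")
    penalty = (n_random_params * math.log(n_subjects)
               + n_fixed_params * math.log(n_subjects * n_obs_per_subject))
    # Lower values indicate a preferred model, as with the classical BIC.
    return -2.0 * log_lik + penalty


# Usage: compare two candidate models fitted to the same data
# (the log-likelihood values below are made up for the example).
bic_small = two_rate_bic(-520.0, n_subjects=50, n_obs_per_subject=20,
                         n_random_params=4, n_fixed_params=1)
bic_large = two_rate_bic(-518.5, n_subjects=50, n_obs_per_subject=20,
                         n_random_params=6, n_fixed_params=1)
preferred = "small" if bic_small < bic_large else "large"
```

Under this assumed form, setting n = 1 makes both terms use log N, so the criterion reduces to the classical BIC with N as the sample size.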

Similar resources

A note on BIC in mixed-effects models

The Bayesian Information Criterion (BIC) is widely used for variable selection in mixed effects models. However, its expression is unclear in typical situations of mixed effects models, where simple definition of the sample size is not meaningful. We derive an appropriate BIC expression that is consistent with the random effect structure of the mixed effects model. We illustrate the behavior of...

Regression with Multiple Candidate Models: Selecting or Mixing?

Model averaging provides an alternative to model selection. An algorithm ARM rooted in information theory is proposed to combine different regression models/methods. A simulation is conducted in the context of linear regression to compare its performance with familiar model selection criteria AIC and BIC, and also with some Bayesian model averaging (BMA) methods. The simulation suggests the foll...

Bayesian information criterion for longitudinal and clustered data.

When a number of models are fit to the same data set, one method of choosing the 'best' model is to select the model for which Akaike's information criterion (AIC) is lowest. AIC applies when maximum likelihood is used to estimate the unknown parameters in the model. The value of -2 log likelihood for each model fit is penalized by adding twice the number of estimated parameters. The number of ...

Model Selection in Linear Mixed Models

Linear mixed effects models are highly flexible in handling a broad range of data types and are therefore widely used in applications. A key part in the analysis of data is model selection, which often aims to choose a parsimonious model with other desirable properties from a possibly very large set of candidate statistical models. Over the last 5–10 years the literature on model selection in l...

Model selection strategies for identifying most relevant covariates in homoscedastic linear models

We propose a new method in two variations for the identification of most relevant covariates in linear models with homoscedastic errors. In contrast to AIC, BIC and other information criteria, our method is based on an interpretable scaled quantity. This quantity measures a maximal relative error one makes by selecting covariates from a given set of all available covariates. The proposed model ...

Publication date: 2012